Norm Monitoring under Partial Action Observability

Natalia Criado and Jose M. Such, Member, IEEE 


Abstract: In the context of using norms for controlling multi-agent
systems, a vitally important question that has not yet been
addressed in the literature is the development of mechanisms for
monitoring norm compliance under partial action observability.
This paper proposes the reconstruction of unobserved actions to
tackle this problem. In particular, we formalise the problem of
reconstructing unobserved actions, and propose an information
model and algorithms for monitoring norms under partial action
observability using two different processes for reconstructing
unobserved actions. Our evaluation shows that reconstructing
unobserved actions significantly increases the number of norm
violations and fulfilments detected.

Index Terms: Norm Monitoring, Action Observability.


I. Introduction 

Within the Multi-agent System (MAS) area, norms are 
understood as means to coordinate and regulate the activity 
of autonomous agents interacting in a given social context.
The existence of autonomous agents that are capable
of violating norms entails the development of norm control 
mechanisms that implement norms in agent societies. 

In the existing literature, several authors have proposed 
infrastructures to observe agent actions and detect norm violations
upon them. The majority of these proposals have
focused on providing efficient and scalable methods to monitor 
norms in dynamic agent societies, but they assume that all 
actions of agents are observable. However, this assumption is 
too strong because it is not necessarily true that all actions to 
be controlled can always be observed. One reason for this is 
that observing actions usually entails high costs. For example, 
the costs of setting up, maintaining, and managing traffic radars
to detect car speeds are very high, so traffic authorities usually 
decide to install a few of them in specific and critical locations. 
Another reason is that illegal actions may take place outside 
the institution controlled by the monitor; however, the effects 
of these actions can still be detected within the institution. 
For example, black market transactions cannot be directly 
observed by legal authorities, yet the corresponding money 
laundering transactions can be detected and sanctioned by 
these authorities. 

Very recent work on norm monitoring under partial action 
observability proposes solutions to ensure complete action 
observability by increasing the actions that are observed, either 
by adding more monitors or by adapting the norms to
what can be observed. However, these solutions are not
always appropriate or feasible. For instance, in e-markets, such
as eBay or Amazon, it is not possible to change trading

N. Criado is with the School of Computing and Mathematical Sciences,
Liverpool John Moores University, UK, e-mail: n.criado@ljmu.ac.uk.

J. M. Such is with the School of Computing and Communications, Infolab21,
Lancaster University, UK, e-mail: j.such@lancaster.ac.uk.

eBay: http://www.ebay.com

Amazon: http://www.amazon.com


laws to what can be observed. This paper goes beyond these 
approaches by also considering actions that were not observed 
but that can be reconstructed from what was observed. 

The main contributions of this paper are: (i) a formalisation
of the problem of reconstructing unobserved actions from
observed actions for the purpose of norm monitoring; (ii) an
exhaustive and an approximate solution to this problem; and
(iii) an information model and algorithms used to monitor
norms under partial action observability. Through an extensive
empirical evaluation, we show that reconstructing unobserved
actions noticeably increases the number of norm violations
and fulfilments detected.

This paper is organised as follows: Section II contains the
preliminary definitions used in this paper. Section III describes
the information model of the norm monitors proposed in this
paper. Section IV contains the algorithms executed by norm
monitors. Our proposal is evaluated in Section V. Related work
is discussed in Section VI. Finally, conclusions are contained
in Section VII.


II. Preliminary Definitions

L is a first-order language containing a finite set of predicate
and constant symbols, the logical connective ¬, the equality
(inequality) symbol = (≠), the true (⊤) and false (⊥) propositions,
and an infinite set of variables. The predicate and constant
symbols are written as any sequence of alphanumeric
characters beginning with a lower case letter. Variables are
written as any sequence of alphanumeric characters beginning
with a capital letter. We also assume the standard notion of
substitution of variables; i.e., a substitution σ is a finite
and possibly empty set of pairs Y/y where Y is a variable
and y is a term.

The set of grounded atomic formulas of L is built from a finite
set of predicates and objects that characterise the properties
of the world relevant to norm monitoring. By a situation,
we mean the properties that are true at a particular moment.
Some of these properties are static and not altered by action
execution, whereas other properties are dynamic and change
due to agent actions. Specifically, we represent static properties
as a set of atomic grounded formulas of L, denoted by g. A
state s is a set of grounded atomic formulas of L describing
the dynamic properties that hold in state s. Thus, a situation is
built on a “closed world assumption” and defined by a set of static
properties g and a state s. Moreover, there is a set of inference
rules (V) representing domain knowledge.

Example. In this paper we will use a running example
in which there are three robots that should attend requests at
six offices in a building. The goal of the robots is to attend
these requests as soon as possible. Figure 1a depicts our initial

In this paper sets are to be interpreted as the conjunction of their elements.
(a) Initial State s_0 (b) State s_1 (c) State s_2

Fig. 1: Example Scenario. Offices are represented by squares,
agents are represented by circles and the corridors are represented
by arrows. Black arrows correspond to corridors observed
by the Norm Monitor (NM) and grey arrows correspond
to corridors not observed by the NM.

scenario. In our example, the language L contains: 4 predicate
symbols (robot, office, in, corridor), used to represent
the robots and offices, the positions of the robots and the
connections between offices in the building; 3 constant symbols
to represent the robots (r1, r2, r3); and 6 constant symbols
to represent the offices (a, b, c, d, e, f). The information about
the robots, offices and corridors between offices is static and
represented as follows:

g = {robot(r1), robot(r2), robot(r3), office(a), ..., office(f),
corridor(a, b), corridor(b, a), ..., corridor(e, a)}

The information about the location of the robots is dynamic.
Specifically, the initial state s_0 is defined as follows:

s_0 = {in(r1, a), in(r2, d), in(r3, e)}

In this domain there is an inference rule (V) representing that
a robot cannot be in two different offices at the same time:

V = {{in(R1, OA), in(R1, OB), OA ≠ OB} ⊢ ⊥}


A. Action Definitions 

D is a finite set of action descriptions that induce state
transitions. An action description d is represented using preconditions
and postconditions. If a situation does not satisfy
the preconditions, then the action cannot be applied in this
situation. In contrast, if the preconditions are satisfied, then
the action can be applied, transforming the current state into
a new state in which all negative literals appearing in the
postconditions are deleted and all positive literals in the
postconditions are added. Moreover, actions are executed in a
MAS and, as a consequence, we need to be able to represent
concurrent actions with interacting effects. For the sake of
simplicity, we will represent concurrent actions without an explicit
representation of time, as proposed in [4]. The main idea
behind this representation is that individual agent actions do
interact (i.e., one action might only achieve the intended effect
if another action is executed concurrently). Specifically, each
action is also represented by a (possibly empty) concurrent
An explicit representation of time may play a role in other problems like
scheduling concurrent actions, but is not strictly necessary for monitoring the 
effects of interaction. 


condition that describes the actions that must (or cannot) be
executed concurrently.

Definition 1. An action description d is a tuple ⟨name,
pre, con, post⟩ where:

• name is the action name;

• pre is the precondition, i.e., a set of positive and negative
literals of L (containing both dynamic and static properties)
as well as equality and inequality constraints on
the variables;

• con is the concurrent condition; i.e., a set of positive and
negative action schemata, some of which can be partially
instantiated or constrained;

• post is the postcondition; i.e., a set of positive and
negative literals of L (containing dynamic properties
only).

Given an action description d, we denote by pre(d), con(d),
post(d) the action precondition, concurrent condition and
postcondition.

Example. In our example, there is only one action that
can be executed by robots:

⟨move, {robot(R), office(O1), office(O2), in(R, O1),
corridor(O1, O2)}, {}, {¬in(R, O1), in(R, O2)}⟩

This action represents the movement of a robot from one office
to another. The parameters of this action are the robot (R),
the source office (O1) and the destination office (O2). To execute
this action, the robot should be located at the source office
and the two offices should be connected. Once the action
has been applied, the robot is no longer at the source office
and it is at the destination office.

Definition 2. Given a situation represented by the state s and
a set of static properties g, and an action description d =
⟨name, pre, con, post⟩; an action instance (or action) is a
tuple ⟨name, pre', con', post'⟩ such that:

• There is a substitution σ of variables in pre, such that the
precondition is satisfied by (i.e., entailed by) the situation;
i.e., s, g ⊢ σ·pre;

• σ·pre and σ·post are grounded;

• pre' is the set of grounded literals in σ·pre containing
dynamic properties only;

• post' = σ·post and con' = σ·con.

Given an action a, we denote by actor(a) the agent
performing the action, and by pre(a), con(a), post(a) the
precondition, concurrent condition and postcondition.

Example. In state s_0, the robot r1 moves from office a to
office b. This is formalised as follows:

⟨move, {robot(r1), office(a), office(b), in(r1, a),
corridor(a, b)}, {}, {¬in(r1, a), in(r1, b)}⟩
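The instantiation and application of the move action can be sketched in Python as follows. This is only an illustrative sketch: the tuple encoding of atoms and literals, the names `instantiate_move` and `apply_action`, and the corridor layout (inferred from the moves mentioned in the running example, not read from Figure 1) are assumptions introduced here, and the static robot/office checks are omitted for brevity.

```python
from typing import FrozenSet, Optional, Tuple

Atom = Tuple[str, ...]  # e.g. ("in", "r1", "a")
Action = Tuple[str, FrozenSet, FrozenSet, FrozenSet]  # (name, pre', con', post')

# Static properties g: assumed corridor layout, stored in both directions.
PAIRS = [("a", "b"), ("b", "c"), ("b", "f"), ("a", "d"),
         ("d", "e"), ("a", "e"), ("e", "f")]
G = frozenset({("corridor", x, y) for x, y in PAIRS}
              | {("corridor", y, x) for x, y in PAIRS})


def instantiate_move(robot, src, dst, state, static=G) -> Optional[Action]:
    """Return the ground move action if its precondition is entailed, else None."""
    if ("in", robot, src) not in state or ("corridor", src, dst) not in static:
        return None
    pre = frozenset({(True, ("in", robot, src))})  # dynamic literals only
    post = frozenset({(False, ("in", robot, src)), (True, ("in", robot, dst))})
    return ("move", pre, frozenset(), post)  # empty concurrent condition


def apply_action(action, state):
    """Delete the atoms of negative postconditions and add the positive ones."""
    _, _, _, post = action
    deleted = {atom for positive, atom in post if not positive}
    added = {atom for positive, atom in post if positive}
    return frozenset((set(state) - deleted) | added)


s0 = frozenset({("in", "r1", "a"), ("in", "r2", "d"), ("in", "r3", "e")})
a1 = instantiate_move("r1", "a", "b", s0)
s_after = apply_action(a1, s0)
```

Here `s_after` contains in(r1, b) in place of in(r1, a), while the positions of r2 and r3 are untouched.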

A more sophisticated definition of the concurrent condition would allow
actions to have conditional effects according to the actions that are executed 
concurrently. Without loss of expressiveness, we will not consider conditional 
effects in action descriptions (note that any action with conditional effects can 
be represented by a set of actions with non conditional effects). 

An action schema contains an action name and the parameters of this
action. Note that positive action schemata are implicitly existentially quantified
(i.e., one instance of each positive schema must occur concurrently) and
negative schemata are implicitly universally quantified.
In a MAS, concurrent actions define state transitions. More
formally, a concurrent action A = {a_1, ..., a_n} is a set of
individual actions. Given a set of actions A = {a_1, ..., a_n},
we define pre(A) = ⋃ pre(a_i), post(A) = ⋃ post(a_i) and
actor(A) = ⋃ actor(a_i).

Given a concurrent action A = {a_1, ..., a_n} we say that the
concurrent condition of an individual action a_i of A is satisfied
when for each positive schema in the concurrent condition there exists
an action a_j (i ≠ j) in A, such that a_j is an instance of the
schema; and for each negative schema none of the elements in
A is an instance of the schema. For the sake of simplicity, we
assume that each agent performs one action at a time.

Definition 3. (Consistency) A concurrent action
A = {a_1, ..., a_n} is consistent if:

• pre(A) is consistent (i.e., pre(A) ⊬ ⊥);

• post(A) is consistent (i.e., post(A) ⊬ ⊥);

• the concurrent condition of each action is satisfied;

• the concurrent action is complete (i.e., each agent performs
one action in A).

Example. The concurrent action A = {move(r1, a, b),
move(r2, d, a), move(r3, e, a)} is consistent since:

• pre(A) = {in(r1, a), in(r2, d), in(r3, e)}, which is consistent;

• post(A) = {in(r1, b), in(r2, a), in(r3, a), ¬in(r1, a),
¬in(r2, d), ¬in(r3, e)}, which is consistent;

• the concurrent conditions of all three actions are satisfied;

• each robot performs one action.

A concurrent action A = {a_1, ..., a_n} is applicable in a
situation if A is consistent and each individual action a_i ∈ A
is applicable in this situation.

Given a consistent concurrent action, we define its effects as the
postconditions of its individual actions and the preconditions
not invalidated by the postconditions. More formally, given
a concurrent action A = {a_1, ..., a_n} its effects are a set of
grounded literals as follows:

\[
\mathit{eff}(A) = \Bigl(\bigcup_{\substack{pre \in pre(A):\\ pre,\, post(A) \nvdash \bot}} pre\Bigr) \cup \Bigl(\bigcup_{post \in post(A)} post\Bigr)
\]
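The consistency test of Definition 3 and the effects of a concurrent action can be sketched in Python as follows. This is a minimal sketch under stated assumptions: actions are encoded as tuples ("move", robot, source, destination), the only dynamic predicate is `in`, concurrent conditions are empty (as in the running example) and therefore not checked, and the helper names are hypothetical.

```python
from itertools import chain


def pre(a):
    """Dynamic precondition of a ground move ("move", robot, src, dst)."""
    return {(True, ("in", a[1], a[2]))}


def post(a):
    """Postcondition: the robot leaves src and is in dst."""
    return {(False, ("in", a[1], a[2])), (True, ("in", a[1], a[3]))}


def consistent(literals):
    """No complementary pair, and no robot located in two offices (rule V)."""
    positive = {atom for sign, atom in literals if sign}
    negative = {atom for sign, atom in literals if not sign}
    if positive & negative:
        return False
    office_of = {}
    for _, robot, office in positive:
        if office_of.setdefault(robot, office) != office:
            return False
    return True


def is_consistent(A, agents):
    """Definition 3, with empty concurrent conditions assumed."""
    pre_A = set(chain.from_iterable(pre(a) for a in A))
    post_A = set(chain.from_iterable(post(a) for a in A))
    complete = sorted(a[1] for a in A) == sorted(agents)  # one action per agent
    return consistent(pre_A) and consistent(post_A) and complete


def eff(A):
    """Postconditions plus the preconditions they do not invalidate."""
    pre_A = set(chain.from_iterable(pre(a) for a in A))
    post_A = set(chain.from_iterable(post(a) for a in A))
    surviving = {l for l in pre_A if consistent(post_A | {l})}
    return surviving | post_A


A = [("move", "r1", "a", "b"), ("move", "r2", "d", "a"), ("move", "r3", "e", "a")]
bad = [("move", "r1", "a", "b"), ("move", "r1", "a", "d")]
```

For the example action A every precondition is invalidated by a negative postcondition, so `eff(A)` coincides with post(A).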

B. Norm Definitions 

We consider norms as formal statements that define patterns 
of behaviour by means of deontic modalities (i.e., obligations 
and prohibitions). Specifically, our proposal is based on the 
notion of norm as a conditional rule of behaviour that defines 
under which circumstances a pattern of behaviour becomes
relevant and must be fulfilled.

Definition 4. A norm is defined as a tuple ⟨deontic,
condition, action⟩, where:

• deontic ∈ {O, P} is the deontic modality of the norm,
determining if the norm is an obligation (O) or prohibition
(P);

Concurrent action means actions that occur at the same time and does not
necessarily imply agent cooperation or coordination.

This limitation can be relaxed by decomposing agents into groups of agents
corresponding to agents’ actuators [4].

For simplicity, we represent actions by their schemata.


• condition is a set of literals of L as well as equality and
inequality constraints that represents the norm condition, 
i.e., it denotes the situations in which the norm is relevant. 

• action is a positive action schema that represents the 
action controlled by the norm. 

Example. In our example, there is a norm that avoids
collisions by forbidding any robot to move into an office when
the office is occupied by another robot:

⟨P, in(R1, L2), move(R2, L1, L2)⟩

This norm states that when a robot R1 is located in office L2,
other robots are forbidden to move from any office L1 to L2.

In line with related literature, we consider a
closed legal system, where everything is considered permitted 
by default, and obligation and prohibition norms define ex¬ 
ceptions to this default permission rule. We also define that a 
norm is relevant to a specific situation if the norm condition 
is satisfied in the situation. Besides, we define that a norm 
condition is satisfied in a given situation when there is a 
substitution of the variables in the norm condition such that the 
constraints in the norm condition are satisfied and the positive 
(vs. negative) literals in the norm condition are true (vs. false) 
in the situation. 

Definition 5. Given a specific situation denoted by a state s
and a set of static properties g, and a norm ⟨deontic,
condition, action⟩; a norm instance is a tuple
⟨deontic, action'⟩ such that:

• There is a substitution σ such that the condition is
satisfied in the situation; i.e., s, g ⊢ σ·condition;

• action' = σ·action.

Example. In state s_0 the norm that forbids robots to move
into occupied offices is instantiated as follows:

⟨P, move(R2, L1, d)⟩ where σ = {L2/d}
⟨P, move(R2, L1, a)⟩ where σ = {L2/a}
⟨P, move(R2, L1, e)⟩ where σ = {L2/e}
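The instantiation of the collision-avoidance prohibition can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the unbound variables R2 and L1 are kept as the strings "R2" and "L1", and matching a ground action against an instance is reduced to comparing destinations, which suffices for this particular norm.

```python
def instantiate_prohibition(state):
    """One instance <P, move(R2, L1, office)> per office occupied in the state."""
    occupied = sorted({office for pred, _, office in state if pred == "in"})
    return [("P", ("move", "R2", "L1", office)) for office in occupied]


def is_forbidden(instances, action):
    """A ground move is forbidden if it unifies with some prohibition instance.

    R2 and L1 are free variables, so only the destination has to match.
    """
    _, _, _, dst = action
    return any(schema[3] == dst for _, schema in instances)


s0 = {("in", "r1", "a"), ("in", "r2", "d"), ("in", "r3", "e")}
instances = instantiate_prohibition(s0)
```

In s_0 this yields three instances, one for each of the occupied offices a, d and e, so any move into those offices is forbidden while a move into b is not.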

The semantics of instances (and norms in general) depends
on their deontic modality. An obligation instance is fulfilled
when the mandatory action is performed and violated otherwise,
while a prohibition instance is violated when the forbidden
action is performed and fulfilled otherwise. We classify
detected violations (vs. fulfilments) into: identified violations
(vs. fulfilments), which refers to when the monitor knows the
specific action that an agent executed and that violates (vs. fulfils)
an instance; and discovered violations (vs. fulfilments), which
refers to when the monitor knows that an agent violated (vs.
fulfilled) some instance but does not know the forbidden (vs.
mandatory) action executed by the agent.

III. NM Information Model 

Let us assume a set of agents Ag to be monitored, a set 
of norms N that regulate the actions of agents, and a set D 
of action descriptions that represent the actions that can be 
performed by agents. For the sake of simplicity, we assume
that there is a single Norm Monitor (NM) that observes the
actions performed by agents and monitors norm compliance.
We also assume that actions are deterministic and that the
current state evolves due to action execution only. The
goal of the NM is to analyse a partial sequence of action
observations to detect norm violations. The enforcement of
norms is out of the scope of this work and we assume that once
the NM detects a norm violation (vs. fulfilment), it applies the
corresponding sanction (vs. reward).

A. State Representation 

As the NM only observes a subset of the actions performed 
by agents, it has partial information about the state of the 
world. The NM represents each partial state of the world, 
denoted by p, using an “open world assumption” as a set 
of grounded literals that are known in the state. Thus, a 
partial state contains positive (vs. negative) grounded literals 
representing dynamic properties known to be true (vs. false) 
in the state. The rest of dynamic properties are unknown. 

To begin with, assume that the NM has complete
knowledge of the initial state (this will be relaxed later). Thus,
at t_0 the NM knows which grounded atomic formulas are true
or false in the initial state (p_0 = s_0). From that moment on,
the NM monitors the actions performed by agents at each
point in time. At time t_0 the NM carries out a monitoring
activity and observes some of the actions performed by agents
(Act_0). These actions have evolved s_0 into a new state s_1.
As previously mentioned, the NM has limited capabilities for
observing the actions performed by agents. Thus, it is possible
that the NM observes only a subset of the actions performed
by agents. Specifically, if all actions have been observed
(|Act_0| = |Ag|), then the resulting partial state p_1 can be
constructed by considering the effects of actions in Act_0 on
p_0, so p_1 = s_1. A different case arises when the NM observes a
strict subset of the actions performed by the agents (|Act_0| < |Ag|).

In this case, the NM cannot be sure about the effects of unobserved
actions. Thus, the new partial state p_1 is constructed
by assuming that the postconditions of the observed actions
must hold in state s_1 (i.e., positive postconditions are positive
literals in p_1 and negative postconditions are negative literals
in p_1) and the rest of the dynamic propositions are unknown. If
the NM takes into account the next set of actions that
it observes at time t_1 (Act_1), then the NM can also infer that
the preconditions of these actions must hold in state s_1, and,
as a consequence, new propositions can be taken for sure in
the partial state p_1, retrospectively. Partial states in the general
case are defined as:

Definition 6. Given a partial state description p_t corresponding
to time t, and two consecutive sets of observed
actions Act_t and Act_{t+1} executed by agents at times t and
t+1 respectively; the new partial state p_{t+1} resulting from
executing actions Act_t in p_t and actions Act_{t+1} in p_{t+1} is
obtained as follows:

\[
p_{t+1} =
\begin{cases}
post(Act_t) \cup pre(Act_{t+1}) & \text{if } |Act_t| < |Ag| \\
\bar{p}_t \cup \mathit{eff}(Act_t) \cup pre(Act_{t+1}) & \text{otherwise}
\end{cases}
\]

where \bar{p}_t is the set of invariant literals; i.e., literals of p_t that
have not been modified by the actions in Act_t, and it is defined
as follows:

\[
\bar{p}_t = \bigcup_{l \in p_t :\ l,\, post(Act_t) \nvdash \bot} l
\]

However, our model can be used by a team of monitors as well.

This assumption could be relaxed if NMs have capabilities for observing
both state changes and actions.

Example. In our example, the NM knows which grounded
atomic formulas are true or false in the initial state:

p_0 = {in(r1, a), ¬in(r1, b), ¬in(r1, c), ¬in(r1, d), ¬in(r1, e),
¬in(r1, f), in(r2, d), ¬in(r2, a), ¬in(r2, b), ¬in(r2, c),
¬in(r2, e), ¬in(r2, f), in(r3, e), ¬in(r3, a), ¬in(r3, b),
¬in(r3, c), ¬in(r3, d), ¬in(r3, f)}

The NM has some surveillance cameras to monitor the movement
of robots in the building. Specifically, the corridors that
are monitored are the ones between offices a and b; b and c;
and b and f. These corridors are represented by black arrows
in Figure 1, whereas non-monitored corridors are represented
by grey arrows. In the initial state (s_0), depicted in Figure 1a,
the robots execute the actions move(r1, a, b), move(r2, d, a)
and move(r3, e, a), resulting in a new state (s_1) depicted
in Figure 1b. However, the NM only observes the action of
robot r1, because this action takes place in a monitored
corridor; i.e., Act_0 = {move(r1, a, b)}. In the next state
s_1, the robots execute actions move(r1, b, c), move(r2, a, e)
and move(r3, a, b), resulting in a new state (s_2) depicted in
Figure 1c. In this case the NM observes two actions; i.e.,
Act_1 = {move(r1, b, c), move(r3, a, b)}. Considering these
two sets of observed actions, the NM is able to infer the
dynamic propositions that are known in s_1 as follows:

p_1 = {in(r1, b), ¬in(r1, a), in(r3, a)}

If the NM uses only the information about the states and the
observed actions, then no violation of the norm is detected and 
no robot is sanctioned. However, r2 and r3 have violated the 
norm, since they have moved into an occupied office through 
non-monitored corridors. 
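The construction of partial states in Definition 6 can be sketched as follows. This is a sketch under stated assumptions: literals are (sign, atom) pairs, actions are tuples ("move", robot, source, destination), "not invalidated" is approximated by checking for the complementary literal only (rule V is not applied), and all function names are hypothetical.

```python
def pre(a):
    return {(True, ("in", a[1], a[2]))}


def post(a):
    return {(False, ("in", a[1], a[2])), (True, ("in", a[1], a[3]))}


def negate(literal):
    return (not literal[0], literal[1])


def union(fn, actions):
    """Union of fn(a) over all actions, e.g. post(Act_t)."""
    out = set()
    for a in actions:
        out |= fn(a)
    return out


def eff(actions):
    post_A = union(post, actions)
    return {l for l in union(pre, actions) if negate(l) not in post_A} | post_A


def next_partial_state(p_t, act_t, act_next, n_agents):
    """Partial state p_{t+1} following the two cases of Definition 6."""
    known_pre = union(pre, act_next)
    if len(act_t) < n_agents:  # some actions at time t were not observed
        return union(post, act_t) | known_pre
    post_t = union(post, act_t)
    invariant = {l for l in p_t if negate(l) not in post_t}
    return invariant | eff(act_t) | known_pre


ROBOTS = ["r1", "r2", "r3"]
START = {"r1": "a", "r2": "d", "r3": "e"}
p0 = {(START[r] == o, ("in", r, o)) for r in ROBOTS for o in "abcdef"}
act0 = [("move", "r1", "a", "b")]
act1 = [("move", "r1", "b", "c"), ("move", "r3", "a", "b")]
p1 = next_partial_state(p0, act0, act1, len(ROBOTS))
```

With only the action of r1 observed at t_0, `p1` contains exactly the three literals listed in the example; if all three moves had been observed, the second branch would yield complete knowledge of s_1.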

B. Action Reconstruction 

NMs use Definition 6 to generate partial state descriptions
based on the observed actions. Additionally, we propose that 
NMs reconstruct the actions that have not been observed. This 
reconstruction process entails: (i) searching for the actions that 
have been performed by unobserved agents; and (ii) using the 
actions found to increase the knowledge about the state of 
the world. The reconstruction process must be sound, i.e.,
it cannot indicate that a violation has occurred when it has 
not in fact occurred. In the following, we introduce full and 
approximate methods for reconstructing unobserved actions. 

1) Full Reconstruction: Full reconstruction tries to find 
exhaustively the actions performed by all the agents that 
have not been observed. To this aim, the full reconstruction 
performs a search to identify all solutions to the reconstruction 
problem. 
Definition 7. Given a partial state description p_t corresponding
to time t (named initial state), a set of observed actions
Act_t at time t, and a partial resulting state p_{t+1} corresponding
to time t + 1 (named final state); we define search as a
function that computes a set of solutions S = {S_1, ..., S_k} such
that each solution S_i in S is a set of actions such that:

• the concurrent action S_i ∪ Act_t is consistent;

• the initial state induced by the concurrent action S_i ∪ Act_t
is consistent (i.e., g, p_t, pre(S_i ∪ Act_t), V ⊬ ⊥);

• the final state induced by the concurrent action S_i ∪ Act_t
is consistent (i.e., g, p_{t+1}, post(S_i ∪ Act_t), V ⊬ ⊥).

Thus, a solution is a set of actions performed by the agents
that have not been observed that are consistent with the states
of the world before and after the execution of the actions.
Given that the NM has a partial knowledge of the states, we 
do not require that the preconditions (vs. postconditions) of 
actions in a solution are met in the initial (vs. final) state, 
since it is possible that the preconditions (vs. postconditions) 
are true, but the NM is unaware of it. 

Example. Given the partial state description p_0, the
set of observed actions Act_0, and the partial resulting state
p_1, the search function looks for actions of agents r2 and
r3 (since they are the agents that have not been observed).
According to the initial position of r2, the NM can infer that r2
may have performed two different actions, move(r2, d, a) and
move(r2, d, e); these two actions are the only ones consistent
with p_0. Similarly, the NM can infer that r3 may have performed
three different actions, move(r3, e, a), move(r3, e, d)
and move(r3, e, f); these three actions are the only ones
consistent with p_0. However, the actions move(r3, e, d) and
move(r3, e, f) are not consistent with the final state: recall
that these two actions have as postcondition the fact that r3
is in offices d and f, respectively; that p_1 defines that r3 is in
office a; and that V defines as inconsistent states where any
robot is at two different locations. As a result, the solution set
for this problem is defined as:

S = {{move(r2, d, a), move(r3, e, a)},
{move(r2, d, e), move(r3, e, a)}}
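The exhaustive search of Definition 7 can be sketched as a brute-force enumeration. This is only a sketch, not the algorithm evaluated in the paper: the corridor layout is an assumption inferred from the moves mentioned in the example, concurrent conditions are empty, and the function names are hypothetical.

```python
from itertools import product

# Assumed corridor layout (undirected), inferred from the running example.
PAIRS = [("a", "b"), ("b", "c"), ("b", "f"), ("a", "d"),
         ("d", "e"), ("a", "e"), ("e", "f")]
CORRIDORS = sorted(set(PAIRS) | {(y, x) for x, y in PAIRS})


def pre(a):
    return {(True, ("in", a[1], a[2]))}


def post(a):
    return {(False, ("in", a[1], a[2])), (True, ("in", a[1], a[3]))}


def union(fn, actions):
    out = set()
    for a in actions:
        out |= fn(a)
    return out


def consistent(literals):
    """No complementary pair, and no robot in two offices (rule V)."""
    positive = {atom for sign, atom in literals if sign}
    negative = {atom for sign, atom in literals if not sign}
    if positive & negative:
        return False
    office_of = {}
    for _, robot, office in positive:
        if office_of.setdefault(robot, office) != office:
            return False
    return True


def full_search(p_t, act_t, p_next, agents):
    """Enumerate one move per unobserved agent; keep the consistent combinations."""
    observed = {a[1] for a in act_t}
    hidden = [r for r in agents if r not in observed]
    options = [[("move", r, s, d) for s, d in CORRIDORS] for r in hidden]
    solutions = set()
    for combo in product(*options):  # exponential in the number of hidden agents
        joint = list(combo) + list(act_t)
        if (consistent(set(p_t) | union(pre, joint))
                and consistent(set(p_next) | union(post, joint))):
            solutions.add(frozenset(combo))
    return solutions


ROBOTS = ["r1", "r2", "r3"]
START = {"r1": "a", "r2": "d", "r3": "e"}
p0 = {(START[r] == o, ("in", r, o)) for r in ROBOTS for o in "abcdef"}
act0 = [("move", "r1", "a", "b")]
p1 = {(True, ("in", "r1", "b")), (False, ("in", "r1", "a")), (True, ("in", "r3", "a"))}
solutions = full_search(p0, act0, p1, ROBOTS)
```

The Cartesian product over the candidate moves of every unobserved agent is what makes this approach expensive as the number of unobserved agents grows.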

Once all solutions are found, the NM uses this information
to extend the information about the actions performed by
unobserved agents and the state of the world. To ensure that
the reconstruction is sound, the NM calculates the intersection
of actions in the solutions to select actions it is completely sure
about (i.e., actions belonging to all solutions). Given a set of
search solutions S = {S_1, ..., S_k} for some initial and final
states, we define the reconstruction action set as follows:

\[
R = \bigcap_{S_i \in S} S_i
\]

If R ≠ ∅, then the NM expands its knowledge about the
actions performed by agents and it uses this information to
increase the knowledge about the initial and final states. More
formally, the set of actions observed in t is updated as:

\[
Act_t = Act_t \cup R
\]

If all actions were observed, no reconstruction would be needed.


The initial state p_t is updated as follows:

\[
p_t = p_t \cup pre(R)
\]

Finally, the final state is updated as follows:

\[
p_{t+1} =
\begin{cases}
p_{t+1} \cup post(R) \cup p^*_t & \text{if } |Act_t| < |Ag| \\
p_{t+1} \cup \bar{p}_t \cup \mathit{eff}(Act_t) & \text{otherwise}
\end{cases}
\]

where \bar{p}_t is defined as before and p^*_t is the set of extended
invariant literals; i.e., literals in p_t that have not been modified
since there is not a solution S_i ∈ S such that the concurrent
action S_i ∪ Act_t changes any of these literals:

\[
p^*_t = \bigcup_{l \in p_t :\ \nexists S_i \in S :\ l,\, post(Act_t \cup S_i) \vdash \bot} l
\]

Example. The reconstruction set for the example is:

R = {move(r3, e, a)}

This action belongs to all solutions, so the NM can be
absolutely sure about the performance of this action, even
though the NM has not observed it. As a consequence, the NM
extends its information as follows:

Act_0 = {move(r1, a, b), move(r3, e, a)}

p_0 remains unchanged and p_1 is updated as follows:

p_1 = {in(r1, b), ¬in(r1, a), ¬in(r1, c), ¬in(r1, d), ¬in(r1, e),
¬in(r1, f), ¬in(r2, b), ¬in(r2, c), ¬in(r2, f), in(r3, a),
¬in(r3, b), ¬in(r3, c), ¬in(r3, d), ¬in(r3, e), ¬in(r3, f)}
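The update that follows full reconstruction can be sketched as below. This is an illustrative sketch: the solution set is given as input (it is the one from the example), "changes a literal" is approximated by the presence of the complementary literal in a postcondition, and the function names are hypothetical.

```python
def pre(a):
    return {(True, ("in", a[1], a[2]))}


def post(a):
    return {(False, ("in", a[1], a[2])), (True, ("in", a[1], a[3]))}


def negate(literal):
    return (not literal[0], literal[1])


def union(fn, actions):
    out = set()
    for a in actions:
        out |= fn(a)
    return out


def reconstruct_update(p_t, p_next, act_t, solutions, n_agents):
    """Return the updated (Act_t, p_t, p_{t+1}) after full reconstruction."""
    R = set.intersection(*map(set, solutions)) if solutions else set()
    act_t = set(act_t) | R
    p_t = set(p_t) | union(pre, R)
    if len(act_t) < n_agents:
        # Extended invariants: literals that no solution could have changed.
        changed = set()
        for S_i in solutions:
            changed |= union(post, act_t | set(S_i))
        extended = {l for l in p_t if negate(l) not in changed}
        p_next = set(p_next) | union(post, R) | extended
    else:
        post_t = union(post, act_t)
        invariant = {l for l in p_t if negate(l) not in post_t}
        effects = {l for l in union(pre, act_t) if negate(l) not in post_t} | post_t
        p_next = set(p_next) | invariant | effects
    return act_t, p_t, p_next


ROBOTS = ["r1", "r2", "r3"]
START = {"r1": "a", "r2": "d", "r3": "e"}
p0 = {(START[r] == o, ("in", r, o)) for r in ROBOTS for o in "abcdef"}
p1 = {(True, ("in", "r1", "b")), (False, ("in", "r1", "a")), (True, ("in", "r3", "a"))}
S = [{("move", "r2", "d", "a"), ("move", "r3", "e", "a")},
     {("move", "r2", "d", "e"), ("move", "r3", "e", "a")}]
act0, p0_new, p1_new = reconstruct_update(p0, p1, [("move", "r1", "a", "b")], S, 3)
```

On the example data this leaves p_0 unchanged and extends p_1 to the fifteen literals listed above; the position of r2 stays unknown because the two solutions disagree on it.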

The main disadvantage of full reconstruction is that, for
many real-world problems, the number of candidate solutions
that needs to be explored is prohibitively large, as shown
later in Section V. In response to this problem, we provide a
polynomial approximation below.

2) Approximate Reconstruction: Approximate reconstruc¬ 
tion includes an approximate search that finds the actions 
performed by unobserved agents that are consistent with 
the states of the world before and after action execution. 
Specifically, approximate reconstruction identifies actions that 
do not necessarily include the specific actions performed by 
unobserved agents but that allow the NM to control norms. 
The main intuition behind approximate reconstruction is as
follows: imagine that at a given initial state an agent can
perform just one action and that this action is forbidden (vs.
mandatory). In this case, the NM identifies that the agent
has violated (vs. fulfilled) a norm. Similarly, if an agent
can perform n different actions and all these actions are
forbidden (vs. mandatory), the NM does not need to know
which action has been executed to conclude that a norm has
been violated (vs. fulfilled). Hence, we say that a violation
has been discovered (instead of identified). Given a set of

Note that the purpose of this paper is to monitor norms, not to determine
whether agents are responsible for norm violations/fulfilments. Monitoring 
situations where agents can only execute forbidden/obligatory actions can 
help to detect norm-design problems. Additionally, the fact that an agent can 
only execute forbidden actions may be explained by the agent putting itself 
into these illegal situations (e.g., I am allowed to overtake but overtaking may 
put me in a situation where I can only exceed the speed limit). 
prohibition instances P and an action a, we define that the
action a is forbidden (denoted by forbidden(P, a)) when
∃p ∈ P : ∃σ : σ·action(p) = a. Similarly, given a set of
obligation instances O and an action a, we define that the
action a is mandatory (denoted by mandatory(O, a)) when
∃o ∈ O : ∃σ : σ·action(o) = a.

Definition 8. Given a partial state p_t, a set of observed actions
Act_t at time t, and a partial resulting state p_{t+1}, we define
approximate search as a function that calculates the set of all
unobserved applicable actions S = {a_1, ..., a_n} such that:

• the preconditions of each action in S are consistent with
the initial state (i.e., ∀a_i ∈ S : g, p_t, pre(a_i), V ⊬ ⊥);

• the postconditions of each action in S are consistent with
the final state (i.e., ∀a_i ∈ S : g, p_{t+1}, post(a_i), V ⊬ ⊥);

• actions in S are performed by unobserved agents (i.e.,
actor(S) ∩ actor(Act_t) = ∅);

• all unobserved agents perform at least one action in S.

Example. Given the partial state description p_0, the set
of observed actions Act_0, and the partial resulting state p_1,
the approximate search function looks for actions of agents r2
and r3 (since they are the agents that have not been observed).
According to the initial position of r2, the NM can infer that
r2 may have performed two different actions, move(r2, d, a)
and move(r2, d, e). Again, r3 may only have performed action
move(r3, e, a). The approximate solution for this problem is
defined as:

S = {move(r2, d, a), move(r2, d, e), move(r3, e, a)}
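The approximate search of Definition 8 can be sketched as follows. Unlike the exhaustive search, each candidate action is checked on its own against the initial and final partial states, so the cost grows linearly with the number of unobserved agents. The corridor layout is an assumption inferred from the moves mentioned in the example, and the function names are hypothetical.

```python
# Assumed corridor layout (undirected), inferred from the running example.
PAIRS = [("a", "b"), ("b", "c"), ("b", "f"), ("a", "d"),
         ("d", "e"), ("a", "e"), ("e", "f")]
CORRIDORS = sorted(set(PAIRS) | {(y, x) for x, y in PAIRS})


def pre(a):
    return {(True, ("in", a[1], a[2]))}


def post(a):
    return {(False, ("in", a[1], a[2])), (True, ("in", a[1], a[3]))}


def consistent(literals):
    """No complementary pair, and no robot in two offices (rule V)."""
    positive = {atom for sign, atom in literals if sign}
    negative = {atom for sign, atom in literals if not sign}
    if positive & negative:
        return False
    office_of = {}
    for _, robot, office in positive:
        if office_of.setdefault(robot, office) != office:
            return False
    return True


def approximate_search(p_t, act_t, p_next, agents):
    """All moves of unobserved agents that fit both partial states."""
    observed = {a[1] for a in act_t}
    S = set()
    for robot in agents:
        if robot in observed:
            continue
        for src, dst in CORRIDORS:
            a = ("move", robot, src, dst)
            if (consistent(set(p_t) | pre(a))
                    and consistent(set(p_next) | post(a))):
                S.add(a)
    return S


ROBOTS = ["r1", "r2", "r3"]
START = {"r1": "a", "r2": "d", "r3": "e"}
p0 = {(START[r] == o, ("in", r, o)) for r in ROBOTS for o in "abcdef"}
p1 = {(True, ("in", "r1", "b")), (False, ("in", "r1", "a")), (True, ("in", "r3", "a"))}
S = approximate_search(p0, [("move", "r1", "a", "b")], p1, ROBOTS)
```

The result is a flat set of candidate actions rather than a set of joint solutions, which is why the subsequent reconstruction step reasons per agent.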

As in full reconstruction, the NM uses approximate search
solutions (S) to expand its knowledge about the actions performed
by unobserved agents and to increase the knowledge
about the initial and final states. When an unobserved agent
may have executed only one action, then the NM knows
for sure that this action was executed. More formally, the
reconstruction action set is defined as follows:

\[
R = \bigcup_{a \in S :\ \nexists a' \in S :\ a \neq a' \wedge actor(a) = actor(a')} a
\]

The set of actions observed in t is updated as:

\[
Act_t = Act_t \cup R
\]

Then the initial state p_t is updated as follows:

\[
p_t = p_t \cup pre(R)
\]

The final state is updated as follows:

\[
p_{t+1} =
\begin{cases}
p_{t+1} \cup post(R) \cup p^*_t & \text{if } |Act_t| < |Ag| \\
p_{t+1} \cup \bar{p}_t \cup \mathit{eff}(Act_t) & \text{otherwise}
\end{cases}
\]

where \bar{p}_t is defined as before and p^*_t is the set of extended
invariant literals in p_t, i.e., literals that have not been modified
since there is not an observed action or an applicable action
that changes them:

\[
p^*_t = \bigcup_{l \in p_t :\ l,\, post(Act_t) \nvdash \bot \ \wedge\ l,\, post(S) \nvdash \bot} l
\]


Finally, the set of discovered violations and fulfilments is a
set of actions defined as follows:

$D = \{a_i, \ldots, a_j\}$

where for each action $a_i$ in $D$: $a_i$ is in $S$ and the agent that
performs $a_i$ (i.e., $actor(a_i)$) is:

• able to execute more than one action (i.e., $\exists a_j \in S :
a_i \neq a_j \wedge actor(a_i) = actor(a_j)$);

• only able to execute forbidden (vs. mandatory) actions
and $a_i$ is one of these forbidden (vs. mandatory) actions.

When an agent is only able to perform forbidden (vs. mandatory)
actions, an action among these can be selected according
to various criteria. For example, in a normative system where
the presumption of innocence principle holds, the NM should
assume that the agent has violated (vs. fulfilled) the least (vs.
most) important norm, and the action that violates (vs. fulfils)
this norm is selected. Note that discovering violations is very useful
in many practical applications, in which it would allow the NM
to ban offender agents (e.g., Intrusion Detection/Prevention
Systems [13]), to stop the execution of any offender agent
(e.g., Business Process Compliance monitoring), or to
put offender agents under close surveillance (e.g., Model-Based
Diagnosis Systems [22]), even when the specific action
performed is not known.

Example. In the case of approximate reconstruction, r3
is only able to perform one action, which entails that the NM
can be absolutely sure about the performance of this action, and
the reconstruction set is defined as:

$R = \{move(r3, e, a)\}$

As a consequence, the NM extends its information as follows:

$Act_0 = \{move(r1, a, b), move(r3, e, a)\}$

$p_0$ remains unchanged and $p_1$ is updated as follows:

$p_1 = \{in(r1, b), \neg in(r1, a), \neg in(r1, c), \neg in(r1, d), \neg in(r1, e),
\neg in(r1, f), \neg in(r2, b), \neg in(r2, c), \neg in(r2, f), in(r3, a),
\neg in(r3, b), \neg in(r3, c), \neg in(r3, d), \neg in(r3, e), \neg in(r3, f)\}$

In this situation, r2 is only able to execute forbidden actions;
recall that the instances $\langle \mathcal{P}, move(R2, L1, a) \rangle$ and
$\langle \mathcal{P}, move(R2, L1, e) \rangle$ forbid any robot to move into offices
a and e, and that r2 may have executed actions
move(r2, d, a) and move(r2, d, e). Thus, the set of discovered
violations and fulfilments is defined as follows:

$D = \{move(r2, d, e)\}$

Note that the discovered violation does not correspond to the
action executed by r2; however, it allows the NM to determine
that r2 must have violated an instance.
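
The split between the reconstruction set and the set of discovered violations can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes actions are encoded as tuples (name, agent, origin, destination), it only covers prohibitions, and the names `split_candidates`, `actor`, `is_forbidden` and `choose` are hypothetical. The selection criterion among forbidden candidates is left as a parameter, since the text allows various criteria.

```python
from collections import defaultdict


def split_candidates(candidates, actor, is_forbidden, choose=min):
    """Split the candidate actions S into reconstructed (R) and discovered (D) sets.

    R holds the action of every agent that has exactly one candidate.
    D holds one representative action for every agent that has several
    candidates, all of which are forbidden.
    """
    by_agent = defaultdict(list)
    for action in candidates:
        by_agent[actor(action)].append(action)

    reconstructed, discovered = set(), set()
    for actions in by_agent.values():
        if len(actions) == 1:
            reconstructed.add(actions[0])
        elif all(is_forbidden(a) for a in actions):
            # Hypothetical criterion: any forbidden candidate can stand for the violation.
            discovered.add(choose(actions))
    return reconstructed, discovered


# Running example: moving into offices a, d or e is forbidden.
S = {
    ("move", "r2", "d", "a"),
    ("move", "r2", "d", "e"),
    ("move", "r3", "e", "a"),
}
R, D = split_candidates(
    S,
    actor=lambda action: action[1],
    is_forbidden=lambda action: action[3] in {"a", "d", "e"},
)
print(R)  # {('move', 'r3', 'e', 'a')}
```

Which of the two candidate actions of r2 ends up in `D` depends only on the `choose` argument; either one is enough to conclude that r2 violated an instance.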

C. Norm Monitoring 

Once all the information about the actions performed by
the agents and the partial states has been reconstructed, the
NM checks the actions of agents to determine which instances
have been violated or fulfilled. Recall that norms in our
model are defined as conditional rules that state which actions
are obligatory or forbidden. Given that the NM has partial
knowledge about the state of the world, the NM should
control norms only when it is completely sure that the norms
are relevant, to ensure that the norm monitoring process is
sound. In particular, we define that a norm is relevant to a
partial situation when the norm condition is satisfied by the
partial situation; i.e., a norm $\langle deontic, condition, action \rangle$ is
relevant to a partial situation represented by a partial state $p$,
the static properties $g$ and the domain knowledge $V$ if $\exists \sigma$ such
that $p, g, V \vdash \sigma \cdot condition$.

Example. In state $p_0$ the norm that forbids robots to move
into occupied offices is instantiated three times as follows:

$\langle \mathcal{P}, move(R2, L1, d) \rangle$ where $\sigma = \{L2/d\}$
$\langle \mathcal{P}, move(R2, L1, a) \rangle$ where $\sigma = \{L2/a\}$
$\langle \mathcal{P}, move(R2, L1, e) \rangle$ where $\sigma = \{L2/e\}$

Once the NM has determined which norm instances hold 
in a given situation, it has to check the actions of agents to 
determine which instances have been violated and which ones 
have been fulfilled. 

Obligation Instance. In presence of partial knowledge about 
the actions performed by agents, the NM can only determine 
that an obligation instance has been fulfilled. If the NM knows 
all the actions performed by agents, then it can determine 
whether an obligation has been fulfilled or violated. 

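Definition 9. Given an obligation instance $\langle \mathcal{O}, action' \rangle$ and
a set of observed actions $Act$, then the obligation is defined
as:

\[ \begin{cases} \text{fulfilled} & \text{iff } \exists \sigma : \sigma \cdot action' \in Act \\ \text{violated} & \text{iff } (\nexists \sigma : \sigma \cdot action' \in Act) \wedge |Act| = |Ag| \\ \text{unknown} & \text{otherwise} \end{cases} \]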


Prohibition Instance. In presence of partial knowledge about 
the actions performed by agents, the NM can only determine 
that a prohibition instance has been violated. If the NM knows 
all the actions performed by agents then it can determine 
whether a prohibition has been fulfilled or violated. 

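Definition 10. Given a prohibition instance $\langle \mathcal{P}, action' \rangle$ and
a set of observed actions $Act$, then the prohibition is defined
as:

\[ \begin{cases} \text{violated} & \text{iff } \exists \sigma : \sigma \cdot action' \in Act \\ \text{fulfilled} & \text{iff } (\nexists \sigma : \sigma \cdot action' \in Act) \wedge |Act| = |Ag| \\ \text{unknown} & \text{otherwise} \end{cases} \]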
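
The three-valued evaluation of Definitions 9 and 10 can be sketched in Python as follows. This is an illustrative sketch only: the existence of a substitution is modelled by a hypothetical predicate `matches` over ground actions, actions are encoded as tuples (name, agent, origin, destination), and the condition that the number of observed actions equals the number of agents is read as "every agent has been observed", which assumes one action per agent per step.

```python
def obligation_status(matches, observed, num_agents):
    """Definition 9: status of an obligation instance."""
    if any(matches(action) for action in observed):
        return "fulfilled"
    if len(observed) == num_agents:
        # Every agent was observed and none performed the obliged action.
        return "violated"
    return "unknown"


def prohibition_status(matches, observed, num_agents):
    """Definition 10: status of a prohibition instance."""
    if any(matches(action) for action in observed):
        return "violated"
    if len(observed) == num_agents:
        # Every agent was observed and none performed the forbidden action.
        return "fulfilled"
    return "unknown"


observed = {("move", "r1", "a", "b"), ("move", "r3", "e", "a")}


def into_a(action):
    return action[0] == "move" and action[3] == "a"


def into_d(action):
    return action[0] == "move" and action[3] == "d"


print(prohibition_status(into_a, observed, num_agents=3))  # violated
print(prohibition_status(into_d, observed, num_agents=3))  # unknown
print(prohibition_status(into_d, observed, num_agents=2))  # fulfilled
```

With three agents and only two known actions, the prohibition on moving into office d stays unknown, because the unobserved agent may still have performed the forbidden action.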

Finally, the set of discovered violations and fulfilments is 
used to identify those agents that have violated or fulfilled an 
instance. 

Example. Taking into account the set of actions $Act_0$,
the NM can identify that robot r3 has violated the instance
$\langle \mathcal{P}, move(R2, L1, a) \rangle$, even though this forbidden action
has not been observed by the NM. Specifically, there is
$\sigma = \{R2/r3, L1/e\}$ such that $\sigma \cdot move(R2, L1, a) \in Act_0$.
Besides that, the approximate reconstruction discovers that
robot r2 has violated a prohibition instance, though it does
not know the exact action performed; recall that
$D = \{move(r2, d, e)\}$. Had the NM not performed the proposed
reconstruction processes, none of these violations would have
been detected.


IV. NM Algorithms 


Algorithm 1 contains the NM pseudocode. In each step, the
NM observes the actions of agents and uses this information
to update the current and the previous partial states (lines 4-9).
If not all the actions were observed in the previous
state, then the NM executes the reconstruction function
to reconstruct unobserved actions (lines 11-14). Then, the
checkNorms function is executed to determine which norms
have been violated and fulfilled in the previous state (line 15)
according to Definitions 9 and 10.


Note that the NM code can be executed while actions are 
performed without delaying agents. Regarding the temporal 
cost of the algorithm executed by NMs, it is determined by 
the cost of the reconstruction function, the implementations 
of which (full and approximate) are discussed below. 


Algorithm 1 NM Algorithm

Require: Ag, N, D, V, g

1: p_0 ← ∅    ▷ p_0 is an empty conjunction of literals
2: t ← 0
3: while true do
4:     Act_t ← observeActions()
5:     if |Act_t| < |Ag| then
6:         p_{t+1} ← post(Act_t)
7:     else
8:         p_{t+1} ← p_t^* ∧ eff(Act_t)
9:     p_t ← p_t ∧ pre(Act_t)
10:    if t > 0 then
11:        if |Act_{t-1}| < |Ag| then
12:            TA ← Ag \ actors(Act_{t-1})    ▷ Target Agents
13:            D ← ∅    ▷ Discovered violations and fulfilments
14:            reconstruction(p_{t-1}, p_t, Act_{t-1}, TA, D)
15:        checkNorms(p_{t-1}, Act_{t-1})
16:    t ← t + 1


Full Reconstruction (Algorithm 2). This pseudocode corresponds
to the full reconstruction function. This function calls
the function search to search the actions of target agents (line
2). Then, for all the solutions found, the NM checks whether they
are consistent (lines 4-6). Finally,
consistent solutions are used to extend the set of observed
actions and the knowledge about the initial and final states
(lines 7-14). The temporal cost of this algorithm is given by
the cost of the search function discussed below.

Algorithm 3 contains the pseudocode of the recursive
search function that computes all the sequences of consistent 
actions that may have been executed by the agents that have 
not been observed. It starts by checking that there is at 
least one target agent (line 2). If so, it identifies all actions 
that might have been executed by one target agent (lines 3- 
4). An action might have been executed if it is consistent 
according to the static properties, the domain knowledge, 
and the initial and final states. For each consistent action, it 
reconstructs the actions of the remaining agents recursively 
(lines 5-13). In the worst case, the temporal cost of this 







Algorithm 2 Full Reconstruction Function

1: function FULLRECONSTRUCTION(i, f, Act, TA, D)
2:     S' ← search(i, f, Act, TA)    ▷ Candidate Solutions
3:     S ← ∅    ▷ Consistent Solutions
4:     for all S_j ∈ S' do
5:         if checkSolutionConsistency(Act, S_j, i, f) then
6:             S ← S ∪ S_j
7:     R ← ⋂_{∀S_j ∈ S} S_j
8:     if R ≠ ∅ then
9:         Act ← Act ∪ R
10:        i ← i ∧ pre(R)
11:        if |Act| < |Ag| then
12:            f ← f ∧ post(R) ∧ i'
13:        else
14:            f ← i* ∧ f ∧ eff(Act)


function is where Ag is the set of agents, 

$D$ is the set of action descriptions and $I_D$ is the maximum
number of instantiations per action. This situation arises when 
no action is observed and all actions are applicable for all 
agents. 

Algorithm 3 Search Function

1: function SEARCH(i, f, Act, TA)
2:     if TA ≠ ∅ then
3:         for all d ∈ D do    ▷ Identify consistent actions
4:             if ∃σ : checkActionConsistency(σ·d, i, f) ∧ actor(σ·d) ∈ TA then
5:                 α = actor(σ·d)
6:                 i' ← i ∧ pre(σ·d)
7:                 f' ← f ∧ post(σ·d)
8:                 Act' ← Act ∪ σ·d
9:                 TA' ← TA \ {α}
10:                S ← search(i', f', Act', TA')
11:                for all S_i ∈ S do
12:                    S_i ← S_i ∪ σ·d
13:        return S
14:    else
15:        return ∅
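
The recursive enumeration behind the search function can be sketched in Python. This is a simplified illustration rather than a transcription of Algorithm 3: it assumes ground actions (substitutions already applied), literals encoded as strings with a leading "-" for negation, and consistency reduced to the absence of complementary literals. The names `Action`, `negate`, `consistent`, `search` and `move` are hypothetical, and the base case returns one empty solution so that partial solutions can be extended on the way back up the recursion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    actor: str
    pre: frozenset   # literals that must hold before the action
    post: frozenset  # literals that hold after the action


def negate(literal):
    return literal[1:] if literal.startswith("-") else "-" + literal


def consistent(state, literals):
    """True if state together with literals contains no complementary pair."""
    merged = set(state) | set(literals)
    return all(negate(lit) not in merged for lit in merged)


def search(initial, final, ground_actions, targets):
    """Return every assignment of one consistent action to each target agent."""
    if not targets:
        return [frozenset()]
    agent = min(targets)  # fixed order, so each joint assignment appears once
    solutions = []
    for action in ground_actions:
        if action.actor != agent:
            continue
        if not (consistent(initial, action.pre) and consistent(final, action.post)):
            continue
        rest = search(initial | action.pre, final | action.post,
                      ground_actions, targets - {agent})
        solutions.extend(partial | {action} for partial in rest)
    return solutions


def move(robot, src, dst):
    return Action(
        name=f"move({robot},{src},{dst})",
        actor=robot,
        pre=frozenset({f"in({robot},{src})"}),
        post=frozenset({f"in({robot},{dst})", f"-in({robot},{src})"}),
    )


actions = [move("r2", "d", "a"), move("r2", "d", "e"), move("r2", "d", "c"),
           move("r3", "e", "a"), move("r3", "e", "f")]
initial = frozenset({"in(r2,d)", "in(r3,e)"})
final = frozenset({"-in(r2,c)", "-in(r3,f)"})
solutions = search(initial, final, actions, frozenset({"r2", "r3"}))
print(len(solutions))  # 2
```

In this toy instance both solutions contain move(r3,e,a), so intersecting them yields exactly the action that can be reconstructed with certainty. Interactions between agents (e.g., two robots entering the same office) are not modelled here; in the paper they are handled by the separate solution-consistency check.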


Approximate Reconstruction Function (Algorithm 4). This
function calls the function ApproximateSearch to search
the applicable actions for each target agent (line 2). Then,
the list of applicable actions for each agent is checked (lines
3-12). Specifically, if an agent may have executed only one
action, then the NM knows that this action was executed and
it updates the reconstructed action set (lines 3-5). Then, the
set of observed actions and the knowledge about the initial
and final states are updated (lines 6-12). Finally, discovered
violations and fulfilments are calculated (lines 14-19). The
temporal cost of this algorithm is given by the cost of the
ApproximateSearch function discussed below.

Algorithm 5 contains the pseudocode of the
ApproximateSearch function. It starts by initialising
the list of applicable actions per agent (lines 3-4). Then it
calculates the set of instances that are relevant to the initial
state (line 7). The function calculates for each target agent
the list of applicable actions that it may have executed (lines
9-11). Then, the list of applicable actions for each agent
is checked (lines 12-18). Specifically, if an agent may have
executed only one action, then the NM knows that this action
was executed and it updates the list of applicable actions and
the initial and final states, and removes the agent from the
target agents (lines 14-17). This process is repeated until
there are no more target agents or the initial and final states
remain unchanged. Then the set of instances that are relevant
to the initial state is calculated (line 13). Finally, the list
of applicable actions per agent is updated with actions of
remaining target agents (lines 19-20). The temporal cost of
this function is $O(|Ag|^2 \times |D| \times I_D)$.

Algorithm 4 Approximate Reconstruction Function

1: function APPROXIMATERECONSTRUCTION(i, f, Act, TA, D)
2:     S ← approximateSearch(i, f, Act, TA)
3:     for all α ∈ TA do
4:         if |S_α| = 1 then
5:             R ← R ∪ S_α
6:     if R ≠ ∅ then
7:         Act ← Act ∪ R
8:         i ← i ∧ pre(R)
9:         if |Act| < |Ag| then
10:            f ← f ∧ post(R) ∧ i°
11:        else
12:            f ← i* ∧ f ∧ eff(Act)
13:    O, P ← calculateInstances(i)
14:    for all α ∈ TA do
15:        if |S_α| > 1 then
16:            if ∄a ∈ S_α : ¬forbidden(P, a) then
17:                D ← D ∪ a    ▷ a is an action from S_α
18:            else if ∄a ∈ S_α : ¬mandatory(O, a) then
19:                D ← D ∪ a    ▷ a is an action from S_α


Algorithm 5 Approximate Search Function

1: function APPROXIMATESEARCH(i, f, Act, TA)
2:     continue ← true
3:     for all α ∈ TA do
4:         S_α ← ∅    ▷ List of approximate actions per agent
5:     while continue ∧ TA ≠ ∅ do
6:         continue ← false
7:         for all α ∈ TA do
8:             L_α ← ∅
9:         for all d ∈ D do
10:            if ∃σ : checkActionConsistency(σ·d, i, f) ∧ actor(σ·d) ∈ TA then
11:                L_{actor(σ·d)} ← L_{actor(σ·d)} ∪ σ·d
12:        for all α ∈ TA do
13:            if |L_α| = 1 then
14:                S_α ← L_α
15:                i ← i ∧ pre(L_α)
16:                f ← f ∧ post(L_α)
17:                TA ← TA \ {α}
18:                continue ← true
19:    for all α ∈ TA do
20:        S_α ← L_α
21:    return S
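
The fixpoint structure of the approximate search can be illustrated with the following Python sketch. It is written under simplifying assumptions that are not part of the paper: actions are ground, literals are strings with a leading "-" for negation, and consistency is reduced to the absence of complementary literals. All names (`Action`, `negate`, `consistent`, `approximate_search`, `move`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    actor: str
    pre: frozenset   # literals that must hold before the action
    post: frozenset  # literals that hold after the action


def negate(literal):
    return literal[1:] if literal.startswith("-") else "-" + literal


def consistent(state, literals):
    """True if state together with literals contains no complementary pair."""
    merged = set(state) | set(literals)
    return all(negate(lit) not in merged for lit in merged)


def approximate_search(initial, final, ground_actions, targets):
    """Return, per target agent, the actions it may have executed."""
    initial, final, targets = set(initial), set(final), set(targets)
    solution, options = {}, {}
    progress = True
    while progress and targets:
        progress = False
        options = {agent: [] for agent in targets}
        for action in ground_actions:
            if (action.actor in targets
                    and consistent(initial, action.pre)
                    and consistent(final, action.post)):
                options[action.actor].append(action)
        for agent in list(targets):
            if len(options[agent]) == 1:
                only = options[agent][0]
                solution[agent] = [only]
                # A certain action refines both states, which may prune
                # the options of the remaining agents in the next pass.
                initial |= only.pre
                final |= only.post
                targets.discard(agent)
                progress = True
    for agent in targets:
        solution[agent] = options.get(agent, [])
    return solution


def move(robot, src, dst):
    return Action(
        name=f"move({robot},{src},{dst})",
        actor=robot,
        pre=frozenset({f"in({robot},{src})"}),
        post=frozenset({f"in({robot},{dst})", f"-in({robot},{src})"}),
    )


actions = [move("r2", "d", "a"), move("r2", "d", "e"), move("r2", "d", "c"),
           move("r3", "e", "a"), move("r3", "e", "f")]
result = approximate_search({"in(r2,d)", "in(r3,e)"},
                            {"-in(r2,c)", "-in(r3,f)"},
                            actions, {"r2", "r3"})
print([a.name for a in result["r3"]])  # ['move(r3,e,a)']
```

Unlike a full search, each pass only looks at agents individually, so the number of passes is bounded by the number of target agents; this is the intuition behind the polynomial cost stated in the text.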


V. Evaluation 

This section compares the performance of a NM with full
reconstruction, a NM with approximate reconstruction, and a
traditional norm monitor that only considers the observed
actions to detect violations (the method used in the
majority of previous proposals [6], [24]), with
respect to their capabilities to monitor norm compliance. We
have evaluated our proposal in a case study, which allows
us to contextualise the results and to give a meaningful
interpretation to them, and in a series of random experiments,
which allow us to evaluate our proposal under a wide range
of different situations and parameter values.

A. Case Study 

We implemented in Java a simulator of the paper example
in which robots attend to requests in offices connected through
corridors. Compliance with the collision avoidance norm is
controlled by a monitor that observes surveillance cameras.
In each simulation, we generate corridors and cameras randomly.
In each step of the simulation, each robot randomly chooses
one applicable action to be executed. The simulation
is executed for 100 steps and repeated 100 times to support the
findings. We conducted experiments in which the number of
offices $O$ took a random value within the $[3, 500]$ interval
and the number of robots $R$ took a random value within the
$[2, 250]$ interval. Besides that, to be able to compare with the
full NM, we also considered small scenarios, in which the
number of offices $O$ takes a random value within $[3, 10]$ and
the number of robots $R$ takes a random value within $[2, 5]$,
as the full reconstruction has an exponential cost and is
intractable for most of the cases with the default intervals.

1) Action Observability: To analyse the performance and
scalability of monitors with respect to their capabilities to
observe actions, we defined the number of corridors $C$ as
a random value within the $[O, O \times (O - 1)]$ interval and
varied the ratio of cameras to corridors (action observability).
Table I shows the percentage of violations detected by each
type of monitor. The higher the ratio of cameras, the more
actions are observed and the better the performance of all
monitors. Moreover, the approximate NM offers on average
a 39% performance improvement over a traditional monitor
(i.e., it identifies 16% more violations plus a further 24%
of discovered violations). That is, an approximate NM outperforms
a traditional monitor with the same capabilities to
observe actions. When compared to the full NM in small scenarios
($O \in [3, 10]$ and $R \in [2, 5]$), the approximate NM performs
similarly. This is explained by the fact that there is a single norm
in this scenario, actions have no concurrency conditions, and
the preconditions and postconditions of actions are disjoint.
In these circumstances, the approximate reconstruction process
reconstructs actions similarly to the full reconstruction.


O ∈ [3, 500] and R ∈ [2, 250]

Cameras Ratio | Traditional Monitor | Approximate NM (Identify+Discover)
0%            | 0%                  | 0+0%
20%           | 11%                 | 14+9%
40%           | 31%                 | 40+12%
60%           | 55%                 | 71+10%
80%           | 78%                 | 91+4%
100%          | 100%                | 100+0%

O ∈ [3, 10] and R ∈ [2, 5]

Cameras Ratio | Traditional Monitor | Full NM | Approximate NM (Identify+Discover)
0%            | 0%                  | 0%      | 0+0%
20%           | 16%                 | 32%     | 32+6%
40%           | 32%                 | 68%     | 67+5%
60%           | 56%                 | 88%     | 85+3%
80%           | 76%                 | 99%     | 99+0%
100%          | 100%                | 100%    | 100+0%

TABLE I: Action Observability Experiment


Note that full reconstruction does not guarantee completeness.


2) Action Instantiations: To analyse the performance and
scalability of monitors with respect to agent capabilities to
execute actions (i.e., the number of instantiations per action),
we varied the ratio of corridors (e.g., a ratio of 0% means
$C = O$) and defined the number of cameras as a random value
within the $[0, C]$ interval. Table II shows the results of this
experiment. The approximate NM offers on average a 43%
performance improvement over a traditional monitor (i.e., it
identifies 29% more violations plus a further 14% of discovered
violations). That is, given the same number of possible
instantiations per action, an approximate NM outperforms a
traditional monitor. Besides, we can see that, as in the previous
experiment, the approximate NM performs similarly to the
full NM. In particular, when the ratio of corridors is higher
than 0%, agents are capable of executing different actions
and the reconstruction process becomes more complex, which
decreases the performance of full and approximate NMs.
However, full and approximate NMs noticeably outperform
the traditional monitor regardless of the ratio of corridors.


O ∈ [3, 500] and R ∈ [2, 250]

Corridors Ratio | Traditional Monitor | Approximate NM (Identify+Discover)
0%              | 51%                 | 98+0%
20%             | 48%                 | 55+6%
40%             | 48%                 | 56+7%
60%             | 41%                 | 47+6%
80%             | 49%                 | 56+13%
100%            | 42%                 | 49+7%

O ∈ [3, 10] and R ∈ [2, 5]

Corridors Ratio | Traditional Monitor | Full NM | Approximate NM (Identify+Discover)
0%              | 52%                 | 99%     | 99+0%
20%             | 59%                 | 80%     | 79+3%
40%             | 55%                 | 74%     | 74+3%
60%             | 51%                 | 70%     | 69+4%
80%             | 55%                 | 68%     | 68+5%
100%            | 57%                 | 70%     | 69+4%

TABLE II: Action Instantiations Experiment


B. Random Experiments 

We implemented a simulator in Java in which there is a set
of agents that perform actions in a monitored environment as
defined below. In particular, our simulator does not model a
specific scenario; rather it creates a different scenario in each
simulation (i.e., randomly generating agent capabilities, the
environment properties, actions and norms). As in the previous
experiments, we have considered big and small scenarios. In
particular, the number of agents $G$ in small scenarios took
a random value within the $[1, 5]$ interval, whereas in big
scenarios $G$ took a random value within the $[1, 500]$ interval.
The number of actions $A$ took a random value within the
$[1, 50]$ interval. Again, the simulation is executed for 100 steps
and repeated 1000 times to ensure that the values of the
simulation parameters range over possible values.

Recall that $C$ takes values within the $[O, O \times (O - 1)]$ interval.

Note that in the random experiments there are more simulation parameters
than in the case-study simulator and a higher number of repetitions is required
to support the findings.

Agent Definition. We modelled different types of agents
with different capabilities to perform actions. In particular,
the set of actions available to each agent depends on the
function/s assumed by each agent in a particular simulation.
To model these capabilities, a set of roles is created at the
beginning of each simulation. Specifically, the number of roles
created took a random value within the $[1, A]$ interval. For
each role a subset of the actions is randomly selected as the
role capabilities; i.e., all agents enacting this role are able to
perform these actions. To prevent all roles from having similar
capabilities, which would lead to simulations populated by
homogeneous agents, the number of actions selected as role
capabilities took a random value within the $[1, \lceil 0.1 \times A \rceil]$
interval (i.e., each role is capable of performing at most
10% of the actions). At the beginning of each simulation, each
agent is defined as enacting a random subset of the roles. In
each step of the simulation, each agent randomly selects one
action among the available actions that it can execute in the
current state.
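
The role generation step can be sketched as follows. This is a hedged illustration in Python, not the authors' Java simulator: the function name `generate_roles` is hypothetical, actions are represented by integer indices, and the upper bound on capabilities is assumed to be rounded up so that the interval is never empty when there are fewer than ten actions.

```python
import math
import random


def generate_roles(num_actions, rng):
    """Create a random list of roles; each role is a set of action indices."""
    num_roles = rng.randint(1, num_actions)
    # Assumption: the 10% bound is rounded up, so every role has at least one action.
    max_capabilities = math.ceil(0.1 * num_actions)
    roles = []
    for _ in range(num_roles):
        size = rng.randint(1, max_capabilities)
        roles.append(set(rng.sample(range(num_actions), size)))
    return roles


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
roles = generate_roles(50, rng)
print(len(roles), max(len(role) for role in roles))
```

With 50 actions, every generated role holds between one and five actions, which keeps roles heterogeneous as the text intends.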

Environment Definition. In the simulator, the environment
is described in terms of different situations or states of affairs
that can be true or false. Each one of these states of affairs is
represented by a grounded proposition. Thus, the state of the
environment is defined in terms of a set of propositions. For
simplicity, we assumed that these propositions are independent
(i.e., propositions are not logically related). In our simulations,
the number of propositions $P$ took a random value within the
$[A, 2 \times A]$ interval (i.e., there is at least one proposition per
action). Besides that, there is a set of grounded atomic
formulas describing the roles played by agents and the actions
that can be performed by each role. The relationship between
agents and roles is formally represented by a binary predicate
(play). Specifically, the expression $play(g, r)$ describes the
fact that the agent identified by $g$ enacts the role identified by
$r$. Similarly, the relationship between roles and actions is formally
represented by a binary predicate (capable). Specifically, the
expression $capable(a, r)$ describes the fact that agents enacting
role $r$ are capable of performing the action identified by $a$. For
simplicity, we assume that the roles enacted by the agents and
the role capabilities are static properties of the environment.

Action Definition. Actions allow agents to change the state
of the environment. At the beginning of each simulation,
a set of actions is randomly generated. For each action
$\langle name, pre, con, post \rangle$ the elements are defined as follows:
$name$ is initialised with a sequential identifier $a$; $pre$ is defined
as $\{play(A, R), capable(R, a), p_1, \ldots, p_n\}$ where the elements
$p_1, \ldots, p_n$ are randomly selected from the proposition set; $con$
is defined as $\{a_1(A_1, R_1), \ldots, a_m(A_m, R_m)\}$, where each $a_i$
is an action randomly selected from the action set such that
$a_i \neq a$ and $A_i, R_i$ are free variables representing the agent
performing the action and the role capable of performing this
action, respectively; and $post$ is defined as $\{p_1, \ldots, p_k\}$ where
each $p_i$ is a proposition randomly selected from the proposition
set. To prevent actions from having too many constraints, which
would be unrealistic and would make actions executable
in only a few situations, the number of propositions in $pre$ and
$post$ takes a random value within the $[1, \lceil 0.1 \times P \rceil]$ interval.
Similarly, the number of actions in $con$ takes a random value
within the $[0, \lceil 0.1 \times A \rceil]$ interval.

Besides these actions, a NOP action, which has no effect
on the environment, was created. To maximise the number
of actions executed in the simulations, which may entail
more violations and fulfilments, we defined that the NOP
action can only be executed by agents when none of their
available actions can be executed. However, similar results
would have been obtained if this condition was relaxed. Our
simulator models scenarios where the NOP action can always
be observed. This is the case in many real domains such as
Intrusion Detection Systems or Autonomous Systems, where
it is not always possible to analyse the data (e.g., the packages)
sent by agents (e.g., hosts) to infer the actions performed, but
it is always possible to know which agents have performed an
action (i.e., which agents have sent packages).

Norm Definition. Agents' actions are regulated by a set
of norms. At the beginning of each simulation, a set of
norms is randomly created. In particular, the number of
norms took a random value within the $[1, A]$ interval (i.e., there
is at most one norm per action). For each norm
$\langle deontic, condition, action \rangle$ the elements are defined
as follows: $deontic$ is randomly initialised with a deontic
operator; $condition$ is defined as $\{p_1, \ldots, p_k\}$ where each $p_i$ is
a proposition randomly selected from the proposition set; and
$action$ is randomly initialised with an action. To allow norms
to be instantiated, the number of propositions in $condition$
takes a random value within the $[0, \lceil P \times 0.1 \rceil]$ interval.

1) Action Observability: To analyse the performance and
scalability of monitors with respect to their capabilities to
observe actions, we varied the observation probability. Tables
III and IV show the percentage of detected fulfilments and
violations, respectively. Again, the approximate NM offers
a significant performance improvement over a traditional
monitor; i.e., the approximate NM offers on average a 74%
performance improvement over a traditional monitor. When
compared to the full NM in small scenarios ($A \in [1, 50]$ and
$G \in [1, 5]$), the full NM offers on average a 21% performance
improvement over an approximate NM. This is explained by
the fact that this experiment is more complex than the case
study; i.e., there are several norms (both prohibition and
obligation norms), actions have concurrent conditions, and actions
may have conflicting preconditions and postconditions (i.e.,
conditions that are defined over the same propositions). Note
that the traditional monitor detects violations and fulfilments
even when the observation probability is 0%. These detections
correspond to situations in which none of the agents can
execute any action (i.e., all agents execute the NOP action),
which leads to the fulfilment of prohibition instances and the
violation of obligation instances. This phenomenon is more
frequent in small scenarios since the lower the number
of agents, the higher the probability that no agent can
execute any action.

A ∈ [1, 50] and G ∈ [1, 500]

Observ. Prob. | Traditional Monitor | Approximate NM (Identify+Discover)
0%            | 2%                  | 35+7%
20%           | 18%                 | 47+6%
40%           | 35%                 | 62+4%
60%           | 50%                 | 72+3%
80%           | 66%                 | 80+2%
100%          | 100%                | 100+0%

A ∈ [1, 50] and G ∈ [1, 5]

Observ. Prob. | Traditional Monitor | Full NM | Approximate NM (Identify+Discover)
0%            | 20%                 | 46%     | 34+1%
20%           | 23%                 | 51%     | 39+1%
40%           | 31%                 | 57%     | 45+1%
60%           | 43%                 | 69%     | 56+0%
80%           | 58%                 | 79%     | 70+0%
100%          | 100%                | 100%    | 100+0%

TABLE III: Fulfilments Detected in the Action Observability
Experiment


A ∈ [1, 50] and G ∈ [1, 500]

Observ. Prob. | Traditional Monitor | Approximate NM (Identify+Discover)
0%            | 2%                  | 36+8%
20%           | 18%                 | 50+6%
40%           | 35%                 | 62+6%
60%           | 49%                 | 70+2%
80%           | 65%                 | 78+2%
100%          | 100%                | 100+0%

A ∈ [1, 50] and G ∈ [1, 5]

Observ. Prob. | Traditional Monitor | Full NM | Approximate NM (Identify+Discover)
0%            | 20%                 | 44%     | 33+1%
20%           | 23%                 | 50%     | 37+0%
40%           | 32%                 | 56%     | 45+0%
60%           | 42%                 | 68%     | 55+0%
80%           | 56%                 | 78%     | 69+0%
100%          | 100%                | 100%    | 100+0%

TABLE IV: Violations Detected in the Action Observability
Experiment


2) Action Possibilities: To analyse the performance and
scalability of monitors with respect to agent capabilities to
execute actions (i.e., the number of available actions), we
defined the observation probability as a random value within
the $[0, 100\%]$ interval and we varied the number of actions.
Tables V and VI show the percentage of detected fulfilments
and violations, respectively. In this experiment, the more
actions, the more complex the reconstruction problem is. As
a consequence, the improvement offered by an approximate
NM over a traditional monitor decreases as the number of
actions increases. However, the approximate NM still offers on
average a 56% performance improvement over a traditional
monitor. When the number of actions is very high (e.g., when
the number of actions is 128 in small scenarios), action
preconditions become very complex and most of the time the
NOP action is executed by all agents, which entails that all
monitors obtain a good performance. We can see that, as in the
previous experiment, the approximate NM performs slightly
worse than the full NM (i.e., the full NM offers on average
a 15% performance improvement over an approximate NM).
However, full and approximate NMs noticeably outperform
the traditional monitor regardless of the number of actions.


G ∈ [1, 500]

Actions | Traditional Monitor | Approximate NM (Identify+Discover)
2       | 52%                 | 95+8%
8       | 49%                 | 75+1%
32      | 32%                 | 48+0%
128     | 29%                 | 40+0%

G ∈ [1, 5]

Actions | Traditional Monitor | Full NM | Approximate NM (Identify+Discover)
2       | 69%                 | 99%     | 95+3%
8       | 47%                 | 75%     | 71+0%
32      | 38%                 | 63%     | 48+0%
128     | 53%                 | 83%     | 57+0%

TABLE V: Fulfilments Detected in the Action Possibilities
Experiment


G ∈ [1, 500]

Actions | Traditional Monitor | Approximate NM (Identify+Discover)
2       | 53%                 | 93+5%
8       | 49%                 | 76+6%
32      | 35%                 | 50+1%
128     | 31%                 | 39+0%

G ∈ [1, 5]

Actions | Traditional Monitor | Full NM | Approximate NM (Identify+Discover)
2       | 68%                 | 99%     | 95+2%
8       | 48%                 | 76%     | 72+1%
32      | 38%                 | 62%     | 47+0%
128     | 56%                 | 85%     | 59+0%

TABLE VI: Violations Detected in the Action Possibilities
Experiment


This condition has been formulated in action preconditions as explained
below.

Note that an action can change the truth value of several propositions.

Note that the capabilities of monitors to detect violations and fulfilments
do not depend on the fact that agents are allowed to perform the NOP action
in any situation.


C. Summary

The conclusions of our evaluation are threefold:

1) Both approximate and full reconstruction processes are
more effective (i.e., detect more norm violations and
fulfilments) than traditional monitoring approaches regardless
of the scenario complexity (i.e., action possibilities
and observability). Both in the case study and in the random
experiments our algorithms significantly improved
the percentage of violations and fulfilments detected.

2) Approximate reconstruction is slightly less effective
than full reconstruction. In the case study, where a single
prohibition norm was monitored, the approximate NM
obtained almost the same results as the full NM. In
our random experiments, where several prohibition and
obligation norms were monitored, the full NM offered
an average improvement of 18% over an approximate
NM.

3) Approximate reconstruction is scalable with the scenario
size (i.e., the number of agents and actions to be monitored).
In particular, our experiments demonstrate that
the approximate algorithm can be used to monitor a large
number of agents (we simulated scenarios with up to 500
agents), actions (we simulated scenarios with up to 128
actions), and norms (we simulated scenarios with up to
128 norms).

VI. Related Work 

Previous work on norms for regulating MAS proposed 
control mechanisms for norms to have an effective influence on 
agent behaviours na . These control mechanisms are classified 
into two main categories US: regimentation mechanisms, 
which make the violation of norms impossible; and enforce¬ 
ment mechanisms, which are applied after the detection of 
norm violations and fulfilments, reacting upon them. 

Regimentation mechanisms prevent agents from performing 
forbidden actions (vs. force agents to perform obligatory 
actions) by mediating access to resources and the commu¬ 
nication channel, such as Electronic Institutions (Els) 02. 
However, the regimentation of all actions is often difficult or 
impossible. Furthermore, it is sometimes preferable to allow 
agents to make flexible decisions about norm compliance 
Q. In response to this need, enforcement mechanisms were 
developed. Proposals on the enforcement of norms can be 
classified according to the entity that monitors whether norms 
are fulfilled or not. Specifically, norm compliance can be 
monitored by either agents themselves or the underlying 
infrastructure may provide monitoring entities. 

Regarding agent monitoring, this approach is characterized
by the fact that norm violations and fulfilments are monitored
by agents that are involved in an interaction [?], [?] or by other
agents that observe an interaction in which they are not directly
involved [?], [?], [?]. The main drawback of proposals
based on agent monitoring is that norm monitoring
and enforcement must be implemented by agent programmers.

Regarding infrastructural monitoring, several authors proposed
developing entities at the infrastructure level that are in
charge of both monitoring and enforcing norms. Cardoso &
Oliveira [?] proposed an architecture in which the monitoring
and enforcement of norms is performed by a single institutional
entity. This centralized implementation represents a performance
limitation when dealing with a considerable number
of agents. To address the performance limitation of centralized
approaches, distributed mechanisms for an institutional
enforcement of norms were proposed in [?], [?], [?], [?].

All of the aforementioned proposals on monitoring assume
that monitors have complete observational capabilities. Exceptions
to these approaches are the recent works of Bulling et al.
[?] and Alechina et al. [?]. Bulling et al. address the partial
observability problem by combining different norm monitors to
build ideal monitors (i.e., monitors that together are able to
detect the violation of a given set of norms). Alechina et al.
propose to synthesise an approximate set of norms that can be
monitored given the observational capabilities of a monitor.
However, there are circumstances in which norms cannot be
modified (e.g., contract and law monitoring) or ideal monitors
are expensive and/or not feasible. We take a different approach
in which norms and monitors' observation capabilities remain
unchanged and monitors reconstruct unobserved actions.

Our approach is also related to planning, where methods
(e.g., POMDPs [?]) for choosing optimal actions in partially
observable environments have been proposed. A major difference
between these proposals and ours is that NMs
do not perform practical reasoning, i.e., they do not try to
optimise or achieve a practical goal. Instead, NMs perform
both deductive and abductive reasoning [25] to reason from
observed actions to a conclusion about the state of the
world, and to infer unobserved actions.
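
The two kinds of reasoning can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithms of this paper: states are assumed to be sets of propositions, actions are assumed to have preconditions plus add and delete effects, and the names `Action`, `apply`, `explain` and the example actions are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset     # propositions that must hold before the action
    add: frozenset     # propositions made true by the action
    delete: frozenset  # propositions made false by the action

def apply(state, action):
    """Deduction: compute the state that results from a known action."""
    return (state - action.delete) | action.add

def explain(before, after, actions):
    """Abduction: actions applicable in `before` that would produce `after`."""
    return [a for a in actions
            if a.pre <= before and apply(before, a) == after]

# Hypothetical action descriptions, for illustration only.
ACTIONS = [
    Action("enter_room", frozenset({"in_corridor"}),
           frozenset({"in_room"}), frozenset({"in_corridor"})),
    Action("open_door", frozenset({"in_corridor"}),
           frozenset({"door_open"}), frozenset()),
    Action("leave_room", frozenset({"in_room"}),
           frozenset({"in_corridor"}), frozenset({"in_room"})),
]

before = frozenset({"in_corridor"})
after = frozenset({"in_room"})

# The action between the two known states was not observed;
# the monitor looks for actions that explain the change.
names = [a.name for a in explain(before, after, ACTIONS)]
```

When exactly one action explains the change, as in this example, the unobserved action is identified and can be checked against the norms; when several actions remain, the monitor can only conclude that one of them occurred.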

VII. Conclusion 

In this paper, we propose information models and algorithms
for monitoring norms under partial action observability
by reconstructing unobserved actions from observed actions
using two different reconstruction processes: full and approximate.
Our experiments demonstrate that both reconstruction
processes detect more norm violations than traditional
monitoring approaches. Approximate reconstruction performs
slightly worse than full reconstruction, but its computational
cost is much lower, making it suitable for use
in practice.

The reconstruction algorithms proposed in this paper can
be applied to several domains that require action monitoring,
from normative MAS [?] to intrusion detection systems [?],
control systems [?] and intelligent surveillance systems
[?]. As future work, we plan to investigate domain-dependent
approximations that could speed up action reconstruction even
further.